ABSTRACT: Ligand-based pharmacophore modelling (LBPM) was carried out on a group of 26 heterocyclic
diamidine derivatives having clinical bioactivity against Trypanosoma brucei gambiense (TBG). A
four-point pharmacophore model of anti-parasitic diamidines has been developed. Positive ionic and
aromatic features were identified as crucial for bioactivity as DNA binders. A statistically
significant 3D QSAR model was developed for further evaluation of the pharmacophore model. The
predictive ability of the model was examined through statistical parameters such as Q² = 0.67 and
r²pred = 0.96. This model can be used for in silico screening and design of potent anti-parasitic
molecules.

INTRODUCTION

Diseases like trypanosomiasis, leishmaniasis and malaria are now at an epidemic stage in humans
and animals and have caused millions of fatalities, owing to their transmission across
geographical barriers and to the absence of potent drugs for treatment (Fairlamb 2003,
Bouteille et al., 2003). Moreover, the emergence of drug-resistant species has driven the
drug-development community to search for better options that give a therapeutic edge against
these species. Drug resistance is almost inevitable and is becoming clinically unmanageable.
Hence, developing anti-parasitic drugs is important from a therapeutic point of view.

Protozoan parasites exhibit a broad range of peculiarities, including polycistronic
transcription and trans-splicing of precursor mRNAs, which are likely due to their early
divergence in the eukaryotic lineage (Yeates 2003). Mitochondrial DNA organization and the
RNA control processes are remarkable features of kinetoplastids, which have a single
mitochondrion enclosing a unique type of DNA organization called kinetoplast DNA (kDNA),
consisting of thousands of interlocked circular DNA molecules referred to as minicircles and
maxicircles (Estévez and Simpson 1999, Liu et al. 2005). This kinetoplast DNA has been an
important target in trypanosomiasis caused by Trypanosoma cruzi. DNA minor groove binders show
a variety of bioactivities, and minor groove binding is an effective strategy in rational drug
design (Tidwell and Boykin 2003, Wilson et al. 2005).

In the present study, the correlation between anti-parasitic activity and the structural
properties of heterocyclic diamidines has been analyzed using a ligand-based pharmacophore
model (LBPM) approach. The compounds used in this study are structurally similar to a
bioactive, less toxic, orally available pro-drug that has shown clinical practicability up to
phase II (Athri et al. 2006, Bailly and Chaires 1998). The 3D pharmacophore alignment is
especially important for further quantification of the structure-activity relationship in 3D
space. Thus it is an important step towards explaining the structural basis of the activity of
these molecules.

Pharmacophore analysis with respect to chemical, structural, topological and other properties
provides an outline for designing new compounds. Moreover, the receptor interaction pattern and
its important features can also be elucidated by this analysis. It is reported that DNA binding
is involved in the bioactivity of diamidine derivatives that target infectious disease
organisms (Henderson and Hurley 1995, Baraldi et al. 2004). In the present case, ligands that
bind selectively to the minor groove of AT-rich DNA sequences can change, modify or stop the
transcription of a specific enzyme (Cory et al. 1992, Wilson et al. 1998). In Leishmania and
trypanosomes, the mitochondrial kinetoplast DNA (kDNA) is the primary site of attack of
diamidines (Shapiro and Englund 1990 and 1995). The trypanosome kinetoplast has recurring
AT-rich sequences, which form a specific and effective target for heterocyclic diamidines
(Athri et al. 2006). The pharmacophore-based 3D QSAR model will help us gain insight into the
structural requirements for better activity.

For generation of the 3D QSAR model we used the PHASE algorithm (Dixon et al. 2006) of the
Schrödinger molecular modeling suite. The result is a statistically significant 3D QSAR model
which may be used further for the development of new effective molecules or for finding a
druggable hit in a virtual screening process.


MATERIALS AND METHODS 

1. Data sets and Biological activity: 

Twenty-six heterocyclic diamidine derivatives reported by Athri et al. (2006) were selected.
Only molecules whose IC50 values were reported as precise values rather than ranges were
considered, and insoluble compounds were discarded. The biological activities were converted to
pIC50 values, and the distribution of the activity was plotted using Gaussian statistics
(Fig. 1) to give a clearer picture of the bioactivity range. Of the 26 heterocyclic diamidine
derivatives, 70% were taken as the training set and the rest were used as the test set. All 2D
molecular structures were drawn using ChemDraw Ultra 8.0 and then converted into 3D structures.
Energy minimization was done using the OPLS 2005 force field in the LigPrep module. The
ionization pH and other parameters of the molecules were taken from the default settings of
that module. For development of the pharmacophore model the ligands were imported into the
PHASE working interface. Conformation generation is an important step in the PHASE algorithm;
conformations were generated using ConfGen with the GB/SA solvent treatment model. About 10,000
conformers were generated per structure, followed by 100 steps of pre-process and 50 steps of
post-process minimization, varying the conformations of amide bonds. The minimized conformers
were filtered using a relative energy limit of 10 kcal mol⁻¹ and a minimum atom deviation of
1.00 Å. Any conformer whose energy exceeded this limit was discarded. Redundant conformers were
eliminated using an RMSD filter window of 0.5 Å for further refinement. Thus only the
lowest-energy conformations of each ligand were incorporated in pharmacophore model
development. Two conformers were considered identical if the distance between them was below
1.00 Å.
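The energy and RMSD filtering described above can be illustrated with a minimal sketch. It is not the ConfGen implementation: the function names, the dictionary layout of a conformer and the toy coordinates are assumptions made for illustration, and the RMSD is computed without superposition for brevity.

```python
import math

def rmsd(a, b):
    """RMSD between two equally ordered coordinate lists (no superposition)."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def filter_conformers(conformers, energy_window=10.0, rmsd_cutoff=0.5):
    """Keep conformers within energy_window (kcal/mol) of the lowest energy,
    dropping any that lie within rmsd_cutoff (Å) of an already kept conformer."""
    ordered = sorted(conformers, key=lambda c: c["energy"])
    e_min = ordered[0]["energy"]
    kept = []
    for conf in ordered:
        if conf["energy"] - e_min > energy_window:
            break  # list is sorted, so every later conformer is also too high
        if all(rmsd(conf["coords"], k["coords"]) >= rmsd_cutoff for k in kept):
            kept.append(conf)
    return kept

# Hypothetical two-atom conformers, for illustration only.
conformers = [
    {"name": "c1", "energy": 0.0, "coords": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]},
    {"name": "c2", "energy": 0.4, "coords": [(0.0, 0.0, 0.0), (1.5, 0.1, 0.0)]},   # near-duplicate of c1
    {"name": "c3", "energy": 3.2, "coords": [(0.0, 0.0, 0.0), (0.0, 1.5, 0.0)]},
    {"name": "c4", "energy": 14.0, "coords": [(0.0, 0.0, 0.0), (0.0, 0.0, 1.5)]},  # outside the energy window
]
kept_names = [c["name"] for c in filter_conformers(conformers)]
print(kept_names)
```

Sorting by energy first means that, of two redundant conformers, the lower-energy one is always the one retained.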


2. Creating Pharmacophore Sites and 
Common pharmacophore hypothesis 
generation: 

According to their pIC50 values, the molecules were divided into actives and inactives by
setting the maximum and minimum values in the activity threshold window of PHASE.
Pharmacophore sites of a ligand are represented in 3D space by a set of points. These points
correspond to chemical features with a type, location and directionality that facilitate
non-covalent bonding with the receptor sites. The pharmacophore features available in PHASE,
namely hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic/non-polar group (H),
negatively ionizable (N), positively ionizable (P) and aromatic ring (R), were used to create
the pharmacophore sites for the energy-minimized ligands. These features were assigned using
SMARTS queries. PHASE uses a tree-based partition algorithm to detect common pharmacophores
from a set of variants, here with a maximum tree depth of 3. To find common pharmacophores,
PHASE performs an exhaustive analysis of the k-point pharmacophores picked from the
conformations of the set of active ligands on the basis of inter-site distances, and then finds
all spatial arrangements of pharmacophore features that are common to at least 8 out of 10
active ligands. The pharmacophores generated therefore have matches across different sets of
actives, which reduces the chance of a hypothesis being exclusive to a small subset of ligands.
The pharmacophore hypotheses produced were further examined using a scoring function to obtain
the best alignment of the active ligands, while also incorporating information from the
inactives to make the model more versatile.
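The idea of grouping k-point pharmacophores by their inter-site distances can be sketched as follows. This is a simplified binning scheme, not the tree-based partition algorithm used by PHASE; the function names, the bin width, the one-conformer-per-ligand simplification and the toy site coordinates are all assumptions for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations

def signatures(sites, k, bin_width):
    """Order-independent signatures of every k-point pharmacophore of one ligand.
    sites is a list of (feature_type, (x, y, z)) tuples."""
    sigs = set()
    for combo in combinations(sites, k):
        variant = "".join(sorted(t for t, _ in combo))
        pairs = sorted(
            ("".join(sorted((t1, t2))), int(math.dist(p1, p2) // bin_width))
            for (t1, p1), (t2, p2) in combinations(combo, 2)
        )
        sigs.add((variant, tuple(pairs)))
    return sigs

def common_hypotheses(ligands, k=4, bin_width=2.0, min_match=2):
    """Signatures shared by at least min_match ligands."""
    matched = defaultdict(set)
    for name, sites in ligands.items():
        for sig in signatures(sites, k, bin_width):
            matched[sig].add(name)
    return {sig: names for sig, names in matched.items() if len(names) >= min_match}

# Hypothetical site coordinates (Å), one conformer per ligand.
ligands = {
    "A": [("P", (0.0, 0, 0)), ("P", (14.0, 0, 0)), ("R", (5.0, 0, 0)), ("R", (9.0, 0, 0))],
    "B": [("P", (0.0, 0, 0)), ("P", (14.5, 0, 0)), ("R", (5.0, 0, 0)), ("R", (9.3, 0, 0))],
    "C": [("P", (0.0, 0, 0)), ("A", (3.0, 3.0, 0)), ("R", (5.0, 0, 0)), ("R", (9.0, 0, 0))],
}
hits = common_hypotheses(ligands)
for (variant, _), names in hits.items():
    print(variant, sorted(names))
```

In the study itself the match threshold was 8 out of 10 actives; `min_match=2` is used here only because the toy set has three ligands. A fixed bin width also splits near-identical distances that straddle a bin boundary, which a real implementation has to handle.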


3. Scoring Pharmacophores according to 
actives and inactives: 

The pharmacophore hypotheses were first scored with respect to the active ligands. To ensure
that no inappropriate pharmacophore remained among the surviving models, a least-squares
site-to-site alignment was considered. The hypotheses were scored using information from the
active ligands, considering various geometric and heuristic factors. The alignment to a
reference pharmacophore was assessed from the RMSD of the site points and the average cosine of
the vectors, with tolerances set to 1.2 Å and 0.5 respectively. To preferentially take the
reference ligand from the most active set, only ligands scoring in the upper 10% were
considered for score calculation. For further refinement, volume scoring was also done to
measure quantitatively how well each non-reference ligand superimposes on the reference ligand,
based on van der Waals models of the structures and taking into account all heavy atoms of the
active ligands. The cutoff for volume scoring was kept at 1.00 for the non-reference
pharmacophores. To ensure that the lowest-energy ligands, which bind better, were incorporated
in the best pharmacophore, the relative conformational energy of the reference ligand was
constrained to 0.001. In this way the survival active scores for the pharmacophore hypotheses
were generated.

A ligand can be inactive for a number of reasons, but to retain only those features that are
important for good binding we need to incorporate knowledge of why the inactive molecules are
inactive. This gives the pharmacophore model a better ability to distinguish between active and
inactive molecules. The inactive score is calculated from the fitness score, which is assessed
with the same constraints as the active score. A good hypothesis gives the inactives a low
fitness score, which is multiplied by a user-adjustable factor that was left at its default
value. For development of the pharmacophore model we considered the most active molecules, 11,
21 and 24. The highest pharmacophore fitness score observed for these ligands is 3; the fitness
score denotes how well a molecule fits the pharmacophore hypothesis. These fitness scores were
correlated with bioactivity and, with the distances and angles between the features allowed to
vary, ten pharmacophore hypotheses were generated. These models were ranked on the basis of all
the above scores, considering their ability to distinguish between active and inactive ligands.
The scores are calculated from the contributions of the alignment of site points and vectors,
volume overlap, selectivity, number of ligands matched, relative conformational energy, and
activity. A common pharmacophore model (CPM) was selected from the group of ten pharmacophore
hypotheses according to the maximum active and inactive contribution of features, followed by
fitness and stability based on the scoring function. For generation of the CPM both active and
inactive molecules were taken into consideration, so that the generated pharmacophore model
could distinguish between the active and inactive features of molecules.
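The two alignment measures named in this section, site-point RMSD and the average cosine between feature vectors, can be sketched as below. The sketch assumes the ligand has already been superimposed on the reference by least squares, and it treats the tolerances of 1.2 Å and 0.5 as simple pass/fail limits; the weighting of these terms in the PHASE survival score is not reproduced. All function names and coordinates are hypothetical.

```python
import math

def site_rmsd(ref_sites, sites):
    """RMSD (Å) between matched site points of two already superimposed pharmacophores."""
    sq = sum(math.dist(a, b) ** 2 for a, b in zip(ref_sites, sites))
    return math.sqrt(sq / len(ref_sites))

def mean_cosine(ref_vecs, vecs):
    """Average cosine of the angle between matched feature vectors."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))
    return sum(cosine(u, v) for u, v in zip(ref_vecs, vecs)) / len(ref_vecs)

def passes_alignment(ref_sites, sites, ref_vecs, vecs, rmsd_tol=1.2, cos_tol=0.5):
    """True if the site RMSD is within rmsd_tol and the mean cosine is at least cos_tol."""
    return site_rmsd(ref_sites, sites) <= rmsd_tol and mean_cosine(ref_vecs, vecs) >= cos_tol

# Hypothetical four-point reference and two candidate ligands.
ref_sites = [(0.0, 0.0, 0.0), (14.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
near = [(x + 0.3, y, z) for x, y, z in ref_sites]   # every site shifted by 0.3 Å
far = [(x + 2.0, y, z) for x, y, z in ref_sites]    # every site shifted by 2.0 Å
ref_vecs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
vecs = [(1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]           # first vector is 45 degrees off

print(site_rmsd(ref_sites, near), mean_cosine(ref_vecs, vecs))
print(passes_alignment(ref_sites, near, ref_vecs, vecs))
print(passes_alignment(ref_sites, far, ref_vecs, vecs))
```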


4. 3D-QSAR model generation: 

An atom-based 3D QSAR model of the pharmacophore hypotheses was generated. The atom-based
approach was chosen because our data set ligands showed very good alignment, the set consisting
of a large variety of derivatives of a parent molecule. An atom-based 3D QSAR model provides
more chemical significance than a pharmacophore-based 3D QSAR model, which depends only on the
pharmacophore sites used for alignment to the selected hypothesis. This is because atom-based
3D QSAR takes the whole molecule into account, so that factors such as probable steric
hindrance with the receptor site can be considered while building the model.

Table 1: Data set with QSAR set assignment, experimental and predicted pIC50, and fitness score.

ID | QSAR set | pIC50 | Predicted pIC50 | Fitness
---|----------|-------|-----------------|--------
1  | training | 2.29  | 2.09            | 2.32
2  | training | 1.16  | 1.43            | 2.33
3  | training | 1.63  | 1.92            | 2.35
4  | test     | 1.45  | 1.58            | 2.29
5  | test     | 1.35  | 1.61            | 2.33
6  | training | 1.45  | 1.4             | 2.28
7  | training | 1.17  | 1.12            | 2.15
8  | training | 1.85  | 1.96            | 2.42
9  | training | 1.63  | 1.79            | 2.32
10 | training | 1.97  | 1.87            | 2.31
11 | test     | 2.52  | 2.26            | 2.67
12 | test     | 2.18  | 2.13            | 2.3
13 | training | 0.99  | 1.03            | 2.19
14 | training | 1.91  | 1.86            | 2.21
15 | training | 2     | 1.94            | 2.38
16 | training | 1.16  | 1.16            | 2.74
17 | test     | 1.72  | 1.35            | 2.68
18 | training | 2.15  | 2.21            | 2.39
19 | test     | 1.86  | 1.78            | 2.44
20 | training | 0.77  | 0.68            | 2.2
21 | training | 2.77  | 2.37            | 3
22 | training | 2.31  | 395             | 2.33
23 | training | 1.77  | 1.89            | 2.14
24 | test     | 2.68  | 2.07            | 2.51
25 | training | 2.39  | 2.52            | 2.34
26 | training | 2.28  | 2.28            | 2.38

Table 2: Distance between pharmacophoric features.

PF* | PF* | Distance (Å)
----|-----|-------------
P4  | P3  | 13.995
P4  | R6  | 6.898
P4  | R7  | 2.864
P3  | R6  | 7.759
P3  | R7  | 11.290
R6  | R7  | 4.036

*PF = Pharmacophoric features

Table 3: Statistical result of 3D QSAR model.

ID     | PLS Factor | SD   | R²   | RMSE | Q²  | Pearson-R | R²pred
-------|------------|------|------|------|-----|-----------|-------
PPRR.3 | 2          | 0.22 | 0.85 | 0.3  | 0.6 | 0.84      | 0.955

Fig. 1: Histogram showing activity distribution of the dataset (x-axis: phase activity; y-axis: Frequency).

Fig. 2: Structural alignment of the data set and pharmacophore.

The PHASE algorithm uses a very versatile approach for the development of a 3D QSAR model. It
considers a rectangular grid with 1 Å spacing in 3D space, thereby creating cubes of that
dimension. The atoms of the molecules, which are treated as overlapping van der Waals spheres,
fall inside these cubes depending on the volume of the atomic spheres. These occupied cubes are
termed volume bits. A volume bit is allocated for each different class of atom that occupies a
cube. There are six atom classes; two hydrogen bond acceptor (A), one positively ionizable (P)
and two aromatic ring (R) features are used for classifying the atom characteristics. The total
number of volume bits assigned to a given cube is based on how many training set molecules
occupy that cube. A single cube may be occupied by one or several atoms or sites, from the same
molecule or from different molecules of the training set. Thus a molecule may be represented by
a binary string corresponding to the occupied cubes and to the types of atomic sites that exist
in those cubes. To create an atom-based QSAR model, these volume bits, which encode the
geometries and chemical characteristics of the molecule, are regarded as independent variables
in PLS (partial least squares) regression analysis. For generating a predictive QSAR model we
selected 3 PLS factors. The maximum number of PLS factors that can be taken is N/5, where N is
the number of ligands in the training set.
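The volume-bit encoding can be illustrated with a short sketch. It is a simplification under stated assumptions: a cube counts as occupied when its centre lies inside an atom sphere (the exact occupancy rule used by PHASE is not given here), and the atom-class labels, radii and coordinates are placeholders. The function names are hypothetical.

```python
import math
from itertools import product

def volume_bits(atoms, grid=1.0):
    """Set of (ix, iy, iz, atom_class) for grid cubes whose centre lies inside an atom sphere.
    atoms is a list of ((x, y, z), radius, atom_class) tuples."""
    bits = set()
    for centre, radius, atom_class in atoms:
        lo = [math.floor((c - radius) / grid) for c in centre]
        hi = [math.floor((c + radius) / grid) for c in centre]
        for cell in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
            cube_centre = tuple((i + 0.5) * grid for i in cell)
            if math.dist(cube_centre, centre) <= radius:
                bits.add(cell + (atom_class,))
    return bits

def bit_vector(bits, columns):
    """Binary row of the descriptor matrix for one molecule."""
    return [1 if col in bits else 0 for col in columns]

def max_pls_factors(n_training):
    """Upper limit N/5 on the number of PLS factors, rounded down."""
    return n_training // 5

# Two hypothetical molecules; "P" and "D" are placeholder atom-class labels.
mol_a = [((0.5, 0.5, 0.5), 0.6, "P")]
mol_b = [((0.5, 0.5, 0.5), 0.6, "P"), ((2.5, 0.5, 0.5), 0.6, "D")]
bits_a, bits_b = volume_bits(mol_a), volume_bits(mol_b)
columns = sorted(bits_a | bits_b)          # one column per occupied (cube, class) pair
X = [bit_vector(bits_a, columns), bit_vector(bits_b, columns)]
print(columns)
print(X)
print(max_pls_factors(19))                 # 19 ligands are marked "training" in Table 1
```

The rows of `X` are the binary strings that serve as independent variables in the PLS regression; with the 19 training-set ligands of Table 1, the N/5 rule allows at most 3 factors.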


RESULT AND DISCUSSION 

In our effort to develop a predictive and statistically significant pharmacophore-based 3D
QSAR model, we selected the pharmacophore hypothesis with the highest survival active and
inactive scores that also showed good structural alignment (Fig. 2) to the most active
molecules and could also be aligned to the inactive ligands, that is, those with activity below
the active threshold set for the model. It also covered diversified variants, ensuring its
uniqueness and selectivity. We thus avoided pharmacophore hypotheses with a lower variant
contribution or lower specificity or selectivity in their fingerprint. Out of 10 pharmacophore
hypotheses the highest ranking was PPRR.3 (Fig. 3), calculated according to the effective

Fig. 3: The pharmacophore hypothesis. 



Ligand based pharmacophore modeling and QSAR analysis of heterocyclic diamidine.. 


Fig.4: H-Bond donor contribution. 


scoring function already described in section 2.3. Thus, through scoring and validation of the
pharmacophore alternatives, one hypothesis was finally selected: a four-point pharmacophore
containing two feature types, namely two positively ionizable (P) groups and two aromatic rings
(R). The distance and angle between the pharmacophoric features are summarized in Table 2. This
pharmacophore model was then used to align the ligands for 3D QSAR model generation.
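As a quick consistency check on the distances listed in Table 2, the six inter-feature distances can be tested against the triangle inequality, which any four points in 3D space must satisfy. The sketch below is illustrative only; the helper name `triangle_ok` is hypothetical and not part of PHASE.

```python
from itertools import combinations

# Inter-feature distances (Å) as listed in Table 2.
distances = {
    frozenset(("P4", "P3")): 13.995,
    frozenset(("P4", "R6")): 6.898,
    frozenset(("P4", "R7")): 2.864,
    frozenset(("P3", "R6")): 7.759,
    frozenset(("P3", "R7")): 11.290,
    frozenset(("R6", "R7")): 4.036,
}

def triangle_ok(a, b, c, dist):
    """True if the three pairwise distances among a, b and c satisfy the triangle inequality."""
    ab = dist[frozenset((a, b))]
    bc = dist[frozenset((b, c))]
    ac = dist[frozenset((a, c))]
    return ab <= bc + ac and bc <= ab + ac and ac <= ab + bc

features = ["P3", "P4", "R6", "R7"]
checks = {trio: triangle_ok(*trio, distances) for trio in combinations(features, 3)}
for trio, ok in checks.items():
    print(trio, ok)
```

All four triples pass. The P4-R7 and R7-R6 distances (2.864 + 4.036 = 6.900 Å) add up to almost exactly the P4-R6 distance (6.898 Å), so these three features lie close to a straight line.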


1. Pharmacophore model 

Three compounds with the highest activity in the total data set were selected for common
pharmacophore hypothesis (CPH) generation. Using a tree-based partition algorithm requiring
that all active compounds match, 16 probable four-feature common pharmacophore hypotheses were
generated from the list of

Fig. 6: Electron withdrawing contribution. 


Fig.5: Structure skeleton. 


variants. No common pharmacophore hypotheses were obtained for five or six common features. On
applying the scoring function to the four-feature common pharmacophore hypotheses using default
values, 10 common pharmacophore hypotheses survived, all belonging to the type PPRR. Training
set compounds were aligned on these common pharmacophore hypotheses and analyzed by the PLS
analysis described in PHASE with ten PLS factors.

Fig.7: Positive ionic contribution.

A good alignment is the primary requirement for a sound QSAR analysis. Validation is equally
imperative, and a diverse distribution of structural features and activities among the training
set is a prerequisite. Accordingly, 70% of the ligands were placed in the training set and the
rest in the test set (Table 1), based on an activity-guided hierarchical clustering method, for
internal validation. The statistical outcomes of the model thus produced indicated its
statistical significance and predictive ability. The PLS results of this model were the best
among the various hypotheses, which further justified that the selected pharmacophore was the
fittest. The statistical parameters considered were R² (correlation coefficient), Q² (q² for
the predicted activity), SD (standard deviation), RMSE (root mean square error), P
(significance level of variance ratio), F-statistics, Pearson-R (correlation between the
predicted and observed activity of the test set), and r²pred, which was calculated from the
formula

$$r^{2}_{\mathrm{pred}} = \frac{SD - PRESS}{SD}$$

where SD is the sum of the squared differences between the experimental biological activities
of the test set ligands and the mean of the experimental activities of the training set
molecules, and PRESS is the sum of squared differences between predicted and experimental
activity values for every ligand in the test set. To avoid over-fitting, the results for 3 PLS
factors were used; these statistical parameters are summarized in Table 3.
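The r²pred formula above translates directly into a few lines of code. The sketch below follows the definitions of SD and PRESS given in the text; the function name and the activity values are made up for illustration and are not taken from the data set.

```python
def r2_pred(train_obs, test_obs, test_pred):
    """Predictive r² = (SD - PRESS) / SD.
    SD: squared deviations of test-set activities from the training-set mean.
    PRESS: squared differences between predicted and observed test-set activities."""
    train_mean = sum(train_obs) / len(train_obs)
    sd = sum((y - train_mean) ** 2 for y in test_obs)
    press = sum((y - p) ** 2 for y, p in zip(test_obs, test_pred))
    return (sd - press) / sd

# Illustrative values only.
train_obs = [1.0, 2.0, 3.0]      # training-set mean = 2.0
test_obs = [1.0, 3.0]            # SD = 1.0 + 1.0 = 2.0
test_pred = [1.5, 2.5]           # PRESS = 0.25 + 0.25 = 0.5
value = r2_pred(train_obs, test_obs, test_pred)
print(value)                     # (2.0 - 0.5) / 2.0 = 0.75
```

Because SD is measured against the training-set mean, r²pred rewards a model only to the extent that it predicts the test set better than simply guessing that mean.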


2. Interpretation of Atom based 3D QSAR model

The ligand-based 3D QSAR model built from the selected pharmacophore hypothesis shows the
contributions of various chemical features, atoms or groups that enhance or decrease the
activity of the ligands. For mapping and visualization of the atom-based 3D QSAR results, we
interpreted the physicochemical contributions using ligand 21, the most active ligand, as the
template. Using the map we also correlated the predicted results with the in vitro bioactivity
and identified the structural and chemical basis for the activity of other active and inactive
ligands under an activity threshold. The various chemical contributions of the ligands
according to the model are discussed here.


i). H-Bond donor contribution: 

The color map of the ligand according to the model shows the H-bond donor (HBD) contributions
(Fig. 4). Blue cubes mark regions where H-bond donor character contributes favorably, and red
cubes mark regions where it is unfavorable. This indicates that the presence of an HBD group
such as -NH2 is favorable for antitrypanosomal bioactivity. The appearance of red blocks around
the =NH group of the benzimidamido fragment indicates that reduction of this group to an -NH2
group may be favorable for bioactivity. This is clear from the features of the least active
molecule, 20, which contains an N-(4-methoxycyclohexyl)benzimidamido residue. In the moderately
active molecules 14 and 26, by contrast, a blue region appears around the -NH2 group of the
terminal phenylmethanamine and picolinimidamido residues, indicating the importance of the -NH2
group. Moreover, a red cube is clearly present around positions 1 and 2 of the skeleton
structure (Fig. 5), which means that an H-bond donor atom there will have a negative impact on
bioactivity. This may be due to a donor-donor field interaction between the two arms attached
at positions 1 and 2. It explains why compounds 1, 11, 18, 21, 22, 24, 25 and 26 show high
activity: an H atom is attached to N and no other HBD is present. In ligands 13, 14 and 20, by
contrast, an additional H-bond donor such as N or O attached to a ring around positions 1 and 2
leads to considerably decreased activity, around 1 or less, owing to the two H-bond donors
nullifying each other. Based on this line of evidence it can be concluded that free -NH2
HBD-type groups are essential for superior bioactivity.


ii). Electron withdrawing contribution: 

The color map (Fig. 6) shows the contribution of the electron withdrawing effect (EWE). Green
cubes show a positive or favorable contribution and red cubes a negative contribution. The
highest activity of the ligand is due to this positive contribution alone. The ligand clearly
fits tightly to the green cubes at the pharmacophore points, with electron withdrawing groups
at those positions contributing to its highest activity. Thus a ligand designed so that these
green regions contain electron withdrawing groups should show high activity. The N atoms at
positions 1, 2, 22 and 23 (Fig. 5), where the green cubes are placed, account for the
bioactivity of all the compounds. Different atoms acting as electron withdrawing groups at
position 11 of the skeleton structure (Fig. 5) change the activity. Compounds 1, 2, 21, 22, 23,
24 and 25, which have an O atom at this position, show considerably increased activity;
compound 11, with a sulphur atom, shows less activity; and compounds 8, 18 and 19, with an N
atom, show slightly lower activity, as these groups have a comparatively weak electron
withdrawing contribution. The red cubes at position 1 of the skeleton explain the decreased
activity of compounds 2, 3, 4, 5, 6, 7, 9, 16 and 19, which carry various electron withdrawing
groups at this unfavorable position.


iii). Positive ionic contribution: 

The color map (Fig. 7) shows the positive ionic contribution of the ligand according to the
model. Red cubes show a negative contribution, which decreases the activity, and violet cubes
show a positive contribution, which increases it. If positive ionic groups are present at
positions 1, 2, 22 and 23 of the structural skeleton (Fig. 5), they contribute negatively to
the activity.


CONCLUSION 

To summarize, our results indicate that the 3D QSAR model is highly predictive and
statistically significant. It may be helpful for designing potent ligands in future for the
development of anti-trypanosomiasis drugs. The structural insight will also be helpful for
understanding receptor binding and bioactivity.